Japanese Zero Pronoun Resolution based on Ranking Rules and Machine Learning
نویسندگان
چکیده
Anaphora resolution is one of the most important research topics in Natural Language Processing. In English, overt pronouns such as she and definite noun phrases such as the company are anaphors that refer to preceding entities (antecedents). In Japanese, anaphors are often omitted, and these omissions are called zero pronouns. There are two major approaches to zero pronoun resolution: the heuristic approach and the machine learning approach. Since we have to take various factors into consideration, it is difficult to find a good combination of heuristic rules. Therefore, the machine learning approach is attractive, but it requires a large amount of training data. In this paper, we propose a method that combines ranking rules and machine learning. The ranking rules are simple and effective, while machine learning can take more factors into account. From the results of our experiments, this combination gives better performance than either of the two previous approaches.
منابع مشابه
Utilizing Features of Verbs in Statistical Zero Pronoun Resolution for Japanese Speech
This paper proposes a statistical zero pronoun resolution method that utilizes features of verbs. In Japanese speech, the subject is often omitted, especially when it is the first person. To resolve such zero pronouns, features related to the verbs such as functional expressions play important roles. However, recent state-of-the-art zero-pronoun resolution systems lack these features because th...
متن کاملZero Pronoun Resolution can Improve the Quality of J-E Translation
In Japanese, particularly, spoken Japanese, subjective, objective and possessive cases are very often omitted. Such Japanese sentences are often translated by Japanese-English statistical machine translation to the English sentence whose subjective, objective and possessive cases are omitted, and it causes to decrease the quality of translation. We performed experiments of J-E phrase based tran...
متن کاملCapturing Salience with a Trainable Cache Model for Zero-anaphora Resolution
This paper explores how to apply the notion of caching introduced by Walker (1996) to the task of zero-anaphora resolution. We propose a machine learning-based implementation of a cache model to reduce the computational cost of identifying an antecedent. Our empirical evaluation with Japanese newspaper articles shows that the number of candidate antecedents for each zero-pronoun can be dramatic...
متن کاملA Deep Neural Network for Chinese Zero Pronoun Resolution
This paper investigates the problem of Chinese zero pronoun resolution. Most existing approaches are based on machine learning algorithms, using hand-crafted features, which is labor-intensive. Moreover, semantic information that is essential in the resolution of noun phrases has not been addressed enough by previous approaches on zero pronoun resolution. This is because that zero pronouns have...
متن کاملIdentification and Resolution of Chinese Zero Pronouns: A Machine Learning Approach
In this paper, we present a machine learning approach to the identification and resolution of Chinese anaphoric zero pronouns. We perform both identification and resolution automatically, with two sets of easily computable features. Experimental results show that our proposed learning approach achieves anaphoric zero pronoun resolution accuracy comparable to a previous state-ofthe-art, heuristi...
متن کامل